Semantic interoperability

Semantic Interoperability is a term used in computer science as a synonym for "Computable Semantic Interoperability". In this sense, it is the ability of computer systems to communicate information and have that information properly interpreted by the receiving system in the same sense as intended by the transmitting system. "Proper interpretation" means that the transmitted information will be used appropriately by a receiving computer system because the logical implications derivable from transmitted information will be the same as those that the sending system would derive. Semantic Interoperability requires that any two systems will derive the same inferences from the same information. This term is sometimes also used as a synonym for "General Semantic Interoperability", the ability of computer systems to place information in a public location and have that information properly interpreted by systems whose developers do not know the creators of the information nor the purpose for which it was created.

Semantic Interoperability is also used in a more general sense, as the ability of any communicating entities (not only computers) to share unambiguous meaning. In this broader sense, the sender must be able to reliably transmit all necessary and sufficient information; the receiver must be able to correctly interpret its interlocutor; and both must be aware of, and agree upon, each other's behaviors for given interactions.

The main topic of this article is Semantic Interoperability among computer systems, or "Computable Semantic Interoperability".

Semantic as a function of syntactic and pragmatic interoperability

Syntactic interoperability, provided by, for instance, XML or the SQL standards, is a prerequisite for semantic interoperability. It involves a common data format and a common protocol for structuring data, so that the manner of processing the information can be interpreted from the structure. It also allows detection of syntactic errors, so that a receiving system can request that any message that appears to be garbled or incomplete be resent. No semantic communication is possible if the syntax is garbled or unable to represent the data. However, information represented in one syntax may in some cases be accurately translated into a different syntax. Where accurate translation between syntaxes is possible, systems using different syntaxes may also interoperate accurately. In some cases the ability to accurately translate information among systems using different syntaxes may be limited to one direction, when the formalisms used have different levels of expressivity (ability to express information).

However, once syntactic correctness has been verified, the intended meaning of the content of a communication still cannot be judged without some commonality in the methods and procedures that each system employs for semantic interpretation, which goes beyond the syntactic. In other words, the use of the data, or the context of application, must be understood and unambiguously defined. Achieving semantic interoperability among computer systems therefore requires some means of ensuring that, if the way terms are used is sensitive to context, the context is also specified as part of the information that uses those terms.
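The gap between syntactic and semantic correctness can be illustrated with a minimal sketch. The message layout, the field names ("temperature", "value", "unit") and the function `to_kelvin` are hypothetical and chosen only for illustration. Both messages are well-formed JSON, but only the second carries the context (the unit) that the receiver needs to interpret the number.

```python
import json

# Hypothetical messages: both are syntactically valid JSON.
raw_ambiguous = '{"temperature": 37.0}'
raw_explicit = '{"temperature": {"value": 37.0, "unit": "celsius"}}'


def to_kelvin(raw):
    """Parse a message and return the temperature in kelvin.

    Raises ValueError when the message parses correctly but does not
    carry the context (the unit) needed to interpret the number.
    """
    message = json.loads(raw)  # syntactic check: raises on malformed JSON
    reading = message["temperature"]
    if not isinstance(reading, dict) or "unit" not in reading:
        raise ValueError("syntactically valid, but the unit is unspecified")
    value, unit = reading["value"], reading["unit"]
    if unit == "celsius":
        return value + 273.15
    if unit == "fahrenheit":
        return (value - 32.0) * 5.0 / 9.0 + 273.15
    if unit == "kelvin":
        return value
    raise ValueError(f"unknown unit: {unit}")
```

Calling `to_kelvin(raw_explicit)` succeeds, while `to_kelvin(raw_ambiguous)` is rejected even though it passes the syntactic check, because the receiver refuses to guess the missing context.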

Semantic interoperability may be achieved in an interactive manner among a limited group of systems that communicate with each other regularly, so as to allow refinement or clarification of meanings that are unclear, or the addition of new meanings. Messages like "this doesn't seem to be an appropriate action at this moment" might provide feedback to the semantic interoperability layer of such interacting systems, suggesting that it has made errors. Over time, the state of a system may also change, or the agreements that govern it may change, as more and more systems are aligned. Messages that are semantically unambiguous at one time may become ambiguous when viewed later, after the range of possible messages has become more refined.

Just as computer programs are very often specified both from top-down user requirements or design precedents and bottom-up system capabilities (like shared libraries and APIs), semantics of a local system or group of systems in communication with each other often emerge from compromises between its syntax and its pragmatics.

The more ambitious goal of General Semantic Interoperability presupposes that such interactive refinements or clarifications of meaning are not possible. In that case, the meanings of any information available to multiple remote systems must be specified in sufficient detail to resolve any potential ambiguity. This requires the use of some common standard of meaning. The current best technology for achieving that level of precision in the specification of meaning is the use of a logical representation at least as expressive as First-Order Logic (FOL)[1][2]. A common ontology allows all interoperating systems to specify the meanings of terms with precision, by linking terms used in specific contexts to the ontology elements that describe the meanings of those terms in logical format.

A single ontology containing representations of every term used in every application is generally considered impossible, because of the rapid creation of new terms or assignments of new meanings to old terms. However, though it is impossible to anticipate every concept that a user may wish to represent in a computer, there is the possibility of finding some finite set of "primitive" concept representations that can be combined to create any of the more specific concepts that users may need for any given set of applications or ontologies. Having a foundation ontology (also called upper ontology) that contains all those primitive elements would provide a sound basis for general semantic interoperability, and allow users to define any new terms they need by using the basic inventory of ontology elements, and still have those newly-defined terms properly interpreted by any other computer system that can interpret the basic foundation ontology. Whether the number of such primitive concept representations is in fact finite, or will expand indefinitely, is a question under active investigation. If it is finite, then a stable foundation ontology suitable to support accurate and general semantic interoperability can evolve after some initial foundation ontology has been tested and used by a wide variety of users. At the present time, no foundation ontology has been adopted by a wide community, so such a stable foundation ontology is still in the future.

Words and Meanings

One persistent misunderstanding recurs in discussions of semantics: the confusion of words and meanings. The meanings of words change, sometimes rapidly. But a formal language such as that used in an ontology can encode the meanings (semantics) of concepts in a form that does not change. In order to determine the meaning of a particular word (or of a term in a database, for example), it is necessary to label each fixed concept representation in an ontology with the word(s) or term(s) that may refer to that concept. When multiple words refer to the same (fixed) concept, this is called synonymy; when one word is used to refer to more than one concept, it is called ambiguity. Ambiguity and synonymy are among the factors that make computer understanding of language very difficult. For many human-readable terms, the use of words to refer to concepts (the meanings of the words used) is very sensitive to the context and the purpose of each use. The role of ontologies in supporting semantic interoperability is to provide a fixed set of concepts whose meanings and relations are stable and can be agreed to by users. The task of determining which terms refer to which concepts in which contexts (each database is a different context) is then separated from the task of creating the ontology, and must be taken up by the designer of a database, the designer of a form for data entry, or the developer of a program for language understanding. When a word used in some interoperability context changes its meaning, then to preserve interoperability it is necessary to change the pointer to the ontology element(s) that specify the meaning of that word.
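The separation between fixed concepts and the words that point to them can be sketched as a pair of lookup tables. Everything below is hypothetical: the concept identifiers ("C001", "C002"), the context names and the helper functions are assumptions made for illustration and are not taken from any published ontology.

```python
# Fixed concept representations; their meanings are assumed never to change.
CONCEPTS = {
    "C001": "financial institution that accepts deposits",
    "C002": "sloping land alongside a river",
}

# Each context (for example, one database) keeps its own term-to-concept pointers.
TERM_POINTERS = {
    "finance_db": {"bank": "C001", "depository": "C001"},  # synonymy
    "geography_db": {"bank": "C002"},  # same word, different concept
}


def resolve(context, term):
    """Return the concept identifier that a term points to in one context."""
    return TERM_POINTERS[context][term]


def synonyms(context, concept_id):
    """Return every term in a context that points to the given concept."""
    return [t for t, c in TERM_POINTERS[context].items() if c == concept_id]


def is_ambiguous(term):
    """A term is ambiguous if different contexts point it at different concepts."""
    targets = {p[term] for p in TERM_POINTERS.values() if term in p}
    return len(targets) > 1


def repoint(context, term, concept_id):
    """When a word changes meaning, change the pointer, not the concept."""
    if concept_id not in CONCEPTS:
        raise KeyError(f"unknown concept: {concept_id}")
    TERM_POINTERS[context][term] = concept_id
```

In this sketch the ontology (`CONCEPTS`) is never edited when usage shifts; only the pointers held by a particular context are updated through `repoint`.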

An example may clarify the point above. An initial ontology used for interoperability may specify that every instance of type 'Automobile' must have four supporting wheels. Later it is learned that automobiles exist, or are developed, that have only three wheels. Such three-wheeled automobiles will not be instances of that initial ontology concept. There are two ways to remedy the problem. The original ontology concept may be changed to include both three- and four-wheeled versions, but that could cause errors in legacy systems that depend on the original meaning of the concept. Preferably, one would instead create a parent (more general) type in the ontology that includes both three- and four-wheeled vehicles, and the original 'Automobile' type would then be made a subtype of that new type. To avoid name clashes, the new concept may be called, e.g., 'GenericAutomobile'. The documentation of the original concept would be changed to alert users that a broader, similar concept representation ('GenericAutomobile') is now available that includes three-wheeled vehicles. The linguistic term 'automobile' in any local terminology could be mapped to the more general type (if three-wheeled automobiles were intended), or to the more restricted type having only four wheels, if that local community only wants to talk about four-wheeled automobiles. Making any change in an ontology can cause problems for legacy users; in this case, some users may have wanted the meaning of 'automobile' to include three-wheeled automobiles, but used the original ontology concept because it was the closest concept available. When the new, more general concept is added, all users need to be informed of the change so that they can determine whether they need to change the pointers from their local terms or data elements to the new ontology element.
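The preferred remedy in this example, adding a more general parent type rather than changing the original one, can be sketched as follows. The table layout, the community names and the helper functions are assumptions made for illustration; a real ontology would state the wheel constraint as a logical axiom rather than as a set of allowed counts.

```python
# Illustrative type table: each type names its parent and its allowed wheel counts.
ONTOLOGY = {
    # New, more general type added later; the original type is left unchanged.
    "GenericAutomobile": {"parent": None, "wheels": {3, 4}},
    # Original type, now a subtype of the new parent.
    "Automobile": {"parent": "GenericAutomobile", "wheels": {4}},
}

# Each local community chooses which type its word 'automobile' points to.
LOCAL_TERMS = {
    "community_a": {"automobile": "Automobile"},  # four wheels only
    "community_b": {"automobile": "GenericAutomobile"},  # three or four wheels
}


def is_subtype(child, ancestor):
    """Walk up the parent links to test whether child is ancestor or below it."""
    while child is not None:
        if child == ancestor:
            return True
        child = ONTOLOGY[child]["parent"]
    return False


def is_instance(wheel_count, type_name):
    """Test whether a vehicle with this many wheels satisfies the type."""
    return wheel_count in ONTOLOGY[type_name]["wheels"]


def term_covers(community, term, wheel_count):
    """Test whether a community's local term applies to such a vehicle."""
    return is_instance(wheel_count, LOCAL_TERMS[community][term])
```

Legacy systems that rely on 'Automobile' still see exactly the four-wheeled meaning, while a community that needs three-wheeled vehicles repoints its local term to 'GenericAutomobile'.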

This scenario illustrates the importance of trying to achieve stability in the ontology used to specify the meanings of terms used in applications, so as to achieve accurate semantic interoperability. For situations where an ontology is developed for use by a closely interacting group of users, direct communication of needs and requirements can assure that the ontology contains representations of all the concepts required by the users. For the general case of semantic interoperability among a wide range of users, who do not have direct contact with each other, the need to anticipate the requirements of users with many different purposes suggests the need to develop some common ontology, in consultation with a broad diversity of potential users, that has the capability of representing any of the concepts those users need, by combining the primitive concept representations. That is the purpose of a Foundation Ontology with a complete set of representations of the semantic primitives, as described in the previous section.

Partial semantic interoperability

To achieve perfect semantic interoperability, all communicating systems must use term (or symbol) definitions that are identical or can be accurately interconverted. Thus a common ontology is the ideal situation for semantic interoperability. Where that is impossible, lesser degrees of semantic interoperability may be achieved by techniques that automatically map the definitions used by one system to those of another.

Interoperability is sometimes considered as an all-or-nothing attribute of computer systems - see upper ontology - but for complex information, different levels of interoperability can be envisioned; when multiple pieces of information are being transferred, correct interpretation of some fraction of that information may be considered as constituting some level of semantic interoperability. Perfect semantic interoperability would require the correct interpretation of all transferred information, from the point of view of all users.
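One simple way to put a number on such a partial level is the fraction of transferred items that the receiver interprets as the sender intended. This measure, the function name `interoperability_level`, and the field names and concept labels in the example are assumptions made for illustration; the text above does not prescribe a particular metric.

```python
def interoperability_level(intended, interpreted):
    """Return the fraction of transferred items interpreted as intended.

    Both arguments map an item name to a label for its meaning.
    A result of 1.0 corresponds to perfect semantic interoperability
    for this transfer; anything lower is partial.
    """
    if not intended:
        raise ValueError("no information was transferred")
    correct = sum(
        1 for item, meaning in intended.items() if interpreted.get(item) == meaning
    )
    return correct / len(intended)


# Hypothetical transfer of four fields; the receiver misreads one of them.
intended = {
    "name": "PersonFullName",
    "dob": "DateOfBirth",
    "weight": "BodyMassInKilograms",
    "id": "NationalIdentifier",
}
interpreted = {
    "name": "PersonFullName",
    "dob": "DateOfBirth",
    "weight": "BodyMassInPounds",  # misinterpreted unit
    "id": "NationalIdentifier",
}
level = interoperability_level(intended, interpreted)  # 3 of 4 items agree
```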

Knowledge representation requirements and languages

A knowledge representation language may be sufficiently expressive to describe nuances of meaning in well understood fields. Such languages fall into at least five levels of complexity.

For general semi-structured data one may use a general purpose language such as XML[3].

For structured data with well understood relationships one may use SQL or the relational model.

A description logic (such as the one used in the OWL semantic web ontology language[4]) is more complex.

Languages with the full power of first-order predicate logic may be required for many tasks.

Human languages are highly expressive, but are considered too ambiguous to allow the accurate interpretation desired, given the current level of human language technology. In human languages the same word may be used to refer to different concepts (ambiguity), and the same concept may be referred to by different words (synonymy).

Prior agreement not required

Semantic interoperability may be distinguished from other forms of interoperability by considering whether the information transferred has, in its communicated form, all of the meaning required for the receiving system to interpret it correctly, even when the algorithms used by the receiving system are unknown to the sending system. Consider sending one number:

If that number is intended to be the sum of money owed by one company to another, it implies some action or lack of action on the part of both those who send it and those who receive it.

It may be correctly interpreted if sent in response to a specific request, and received at the time and in the form expected. This correct interpretation does not depend only on the number itself, which could represent almost any of millions of types of quantitative measure; rather, it depends strictly on the circumstances of transmission. That is, the interpretation depends on both systems expecting that the algorithms in the other system use the number in exactly the same sense, and it depends further on the entire envelope of transmissions that preceded the actual transmission of the bare number. By contrast, if the transmitting system does not know how the information will be used by other systems, it is necessary to have a shared agreement on how information with some specific meaning (out of many possible meanings) will appear in a communication. For a particular task, one solution is to standardize a form, such as a request for payment; that request would have to encode, in standardized fashion, all of the information needed to evaluate it, such as: the agent owing the money; the agent owed the money; the nature of the action giving rise to the debt; the agents, goods, services, and other participants in that action; the time of the action; the amount owed and the currency in which the debt is reckoned; the time allowed for payment; the form of payment demanded; and other information. When two or more systems have agreed on how to interpret the information in such a request, they can achieve semantic interoperability for that specific type of transaction. For semantic interoperability generally, it is necessary to provide standardized ways to describe the meanings of many more things than just commercial transactions, and the number of concepts whose representation needs to be agreed upon is at a minimum several thousand.
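A standardized request for payment of the kind described above might be sketched as a record type. The class name, the field names and the sample values are hypothetical, and the record covers only some of the items listed; it is not drawn from any actual invoicing standard.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class PaymentRequest:
    """Illustrative standardized form; every field name is an assumption."""

    debtor: str  # the agent owing the money
    creditor: str  # the agent owed the money
    cause: str  # the nature of the action giving rise to the debt
    action_date: date  # the time of the action
    amount: Decimal  # the amount owed
    currency: str  # the currency in which the debt is reckoned
    due_date: date  # the end of the time allowed for payment
    payment_form: str  # the form of payment demanded


def days_allowed(request):
    """Return the number of days between the action and the due date."""
    return (request.due_date - request.action_date).days


# A bare number such as 1250.00 says nothing on its own; the same number
# placed in the agreed form can be interpreted without prior conversation.
request = PaymentRequest(
    debtor="Company A",
    creditor="Company B",
    cause="delivery of goods",
    action_date=date(2024, 1, 15),
    amount=Decimal("1250.00"),
    currency="USD",
    due_date=date(2024, 2, 14),
    payment_form="bank transfer",
)
```

Two systems that agree on this one form can interoperate for this one type of transaction, which is exactly the limitation the paragraph above points out.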

Ontology research

How to achieve semantic interoperability for more than a few restricted scenarios is currently a matter of research and discussion. For the problem of General Semantic Interoperability, some form of foundation ontology ('upper ontology') is required that is sufficiently comprehensive to provide the defining concepts for more specialized ontologies in multiple domains. Over the past decade more than ten foundation ontologies have been developed, but none have as yet been adopted by a wide user base.

The need for a single comprehensive all-inclusive ontology to support Semantic Interoperability can be avoided by designing the common foundation ontology as a set of basic ("primitive") concepts that can be combined to create the logical descriptions of the meanings of terms used in local domain ontologies or local databases. This tactic is based on the principle that:

If:

(1) the meanings and usage of the primitive ontology elements in the foundation ontology are agreed on, and
(2) the ontology elements in the domain ontologies are constructed as logical combinations of the elements in the foundation ontology,

Then:

The intended meanings of the domain ontology elements can be computed automatically using an FOL reasoner, by any system that accepts the meanings of the elements in the foundation ontology, and has both the foundation ontology and the logical specifications of the elements in the domain ontology.

Therefore:

Any system wishing to interoperate accurately with another system need transmit only the data to be communicated, plus any logical descriptions of terms used in that data that were created locally and are not already in the common foundation ontology.
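The principle above can be illustrated with a deliberately simplified sketch. Here each domain term is defined only as a conjunction of other terms, and a receiver recovers its meaning by expanding it down to shared primitives. This is an assumption made to keep the example short: it stands in for, and is much weaker than, the FOL reasoner mentioned above. The primitive names and the sender's definitions are invented for illustration.

```python
# Primitives assumed to be agreed on in advance by every system.
PRIMITIVES = frozenset(
    {"Vehicle", "SelfPropelled", "RoadGoing", "CarriesPassengers", "CarriesCargo"}
)


def expand(term, definitions, _seen=()):
    """Expand a term into the set of primitives that define it.

    `definitions` maps each locally created term to the list of terms
    (primitive or local) whose conjunction defines it.
    """
    if term in PRIMITIVES:
        return frozenset({term})
    if term in _seen:
        raise ValueError(f"circular definition involving {term!r}")
    if term not in definitions:
        raise KeyError(term)  # neither a primitive nor a transmitted definition
    result = frozenset()
    for part in definitions[term]:
        result |= expand(part, definitions, _seen + (term,))
    return result


# The sender transmits its data plus only the locally created definitions.
sender_definitions = {
    "MotorVehicle": ["Vehicle", "SelfPropelled", "RoadGoing"],
    "Bus": ["MotorVehicle", "CarriesPassengers"],
}

# A receiver that knows only PRIMITIVES can now interpret the sender's term.
bus_meaning = expand("Bus", sender_definitions)
```

The receiver never needed prior agreement on "Bus" or "MotorVehicle"; agreement on the primitives, plus the transmitted definitions, was enough in this simplified setting.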

This tactic then limits the need for prior agreement on meanings to only those ontology elements in the common Foundation Ontology (FO). Based on several considerations, this is likely to be fewer than 10,000 elements (types and relations).

In practice, together with the FO focused on representations of the primitive concepts, a set of domain extension ontologies to the FO with elements specified using the FO elements will likely also be used. Such pre-existing extensions will ease the cost of creating domain ontologies by providing existing elements with the intended meaning, and will reduce the chance of error by using elements that have already been tested. Domain extension ontologies may be logically inconsistent with each other, and that needs to be determined if different domain extensions are used in any communication.

Whether use of such a single foundation ontology can itself be avoided by sophisticated mapping techniques among independently developed ontologies is also under investigation. See upper ontology for further description of research in these fields.

Importance

The practical significance of semantic interoperability has been measured by several studies that estimate the cost (in lost efficiency) due to lack of semantic interoperability. One study[5], focusing on the lost efficiency in the communication of healthcare information, estimated that US$77.8 billion per year could be saved by implementing an effective interoperability standard in that area. Other studies, of the construction industry[6] and of the automobile manufacturing supply chain[7], estimate costs of over US$10 billion per year due to lack of semantic interoperability in those industries. In total these numbers can be extrapolated to indicate that well over US$100 billion per year is lost because of the lack of a widely used semantic interoperability standard in the US alone.

No study has yet examined every policy field in which applying semantic interoperability standards might offer large cost savings. For the policy fields that could profit from semantic interoperability, see 'Interoperability' in general; they include eGovernment, health, security and many more. The EU also set up the Semantic Interoperability Centre Europe in June 2007.

References